Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flask apps only signal an exception on real server errors #1611

Merged
merged 2 commits into from
Dec 7, 2022

Conversation

enerqi
Copy link
Contributor

@enerqi enerqi commented Nov 15, 2022

Fixes the problem of non-exceptional "exceptions" being recorded by telemetry systems as a serious error. This expands on the changes made in #1326. The intention of that other change seems to be making telemetry systems like Sentry record serious errors. Diving into the Sentry implementation even shows that the signalled exception is recorded at "level" = "error".

The root of the problem is that exceptions are being used for control flow, which is not ideal - convenient for app writers but not always for library maintainers. The connexion BadRequestProblem and NotFoundProblem, for example, should not be recorded as an error in a telemetry system. In my case elastic-apm-python is receiving these signals and recording 4xx events as serious errors.

This pull request only propagates an exception signal to flask if it's a serious error.

Aiohttp applications have a similar problem with exceptions being used for control flow - the problems middleware will convert the exception into an appropriate problem response but things like the elastic apm python telemetry middleware will see that exception and record it as a serious error. Interestingly aiohttp also uses exceptions for control flow and the elastic apm agent was patched to specifically ignore aio web exceptions below the 5xx status response range. Elastic apm and sentry cannot be expected to be aware of non-serious control flow exceptions used by various libraries though. So, a solution for Aiohttp applications is a separate problem.

Copy link
Member

@RobbeSneyders RobbeSneyders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @enerqi, seems logical to follow Flask here.

@@ -85,6 +88,9 @@ def common_error_handler(self, exception):
headers=exception.get_headers(),
)

if is_server_error:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we replace this with if response["status"] >= 500?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is .status_code on the "ConnexionResponse" type.

@coveralls
Copy link

Pull Request Test Coverage Report for Build 3469952874

  • 4 of 5 (80.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.02%) to 94.314%

Changes Missing Coverage Covered Lines Changed/Added Lines %
connexion/apps/flask_app.py 4 5 80.0%
Totals Coverage Status
Change from base Build 3396121566: -0.02%
Covered Lines: 2936
Relevant Lines: 3113

💛 - Coveralls

@spec-first spec-first deleted a comment from coveralls Nov 24, 2022
@RobbeSneyders RobbeSneyders merged commit a829536 into spec-first:main Dec 7, 2022
RobbeSneyders pushed a commit that referenced this pull request Jan 24, 2023
Fixes the problem of non-exceptional "exceptions" being recorded by
telemetry systems as a serious error. This expands on the changes made
in #1326. The intention of that other change seems to be making
telemetry systems like Sentry record serious errors. Diving into the
Sentry implementation even shows that the signalled exception is
recorded at "level" = "error".

The root of the problem is that exceptions are being used for control
flow, which is not ideal - convenient for app writers but not always for
library maintainers. The connexion BadRequestProblem and
NotFoundProblem, for example, should not be recorded as an error in a
telemetry system. In my case
[elastic-apm-python](https://github.com/elastic/apm-agent-python) is
receiving these signals and recording 4xx events as serious errors.

This pull request only propagates an exception signal to flask if it's a
serious error.

Aiohttp applications have a similar problem with exceptions being used
for control flow - the problems middleware will convert the exception
into an appropriate problem response but things like the elastic apm
python telemetry middleware will see that exception and record it as a
serious error. Interestingly aiohttp also uses exceptions for control
flow and the elastic apm agent was patched to specifically ignore aio
web exceptions below the 5xx status response range. Elastic apm and
sentry cannot be expected to be aware of non-serious control flow
exceptions used by various libraries though. So, a solution for Aiohttp
applications is a separate problem.

Co-authored-by: Enerqi <>
jcristau added a commit to jcristau/balrog that referenced this pull request Jun 12, 2023
jcristau added a commit to jcristau/balrog that referenced this pull request Jun 12, 2023
jcristau added a commit to jcristau/balrog that referenced this pull request Jun 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants